Rapid species discrimination of similar insects using hyperspectral imaging and lightweight edge artificial intelligence

Species discrimination of insects is an important aspect of ecology and biodiversity research. Traditional methods based on human visual experience and biochemical analysis cannot strike a balance between accuracy and timeliness. Morphological identification using computer vision and machine learning is expected to solve this problem, but image features give poor accuracy for very similar species and usually require complicated networks that are ill-suited to portable edge devices. In this work, we propose a fast and accurate species discrimination method for similar insects using hyperspectral features and a lightweight machine learning algorithm. Feature region selection, feature spectra selection and model quantization are used to optimize the discriminating network. The experimental results for six similar butterfly species in the genus Graphium show that, compared with morphological recognition by machine vision, our work achieves a higher accuracy of 92.36 ± 3.04% and a shorter inference time of 0.6 ms, with a tiny convolutional neural network deployed on a neural network chip. This study provides a rapid and high-accuracy species discrimination method for insects with high appearance similarity and paves the way for field discrimination using intelligent micro-spectrometers based on on-chip microstructures and artificial intelligence chips.


Introduction
Insects are the largest group of animals on Earth, with a wide variety and diverse forms [1]. The discrimination of insect species is an important aspect of ecology and biodiversity research [2]. Traditional identification by human visual perception relies primarily on the subjective experience of ecologists, resulting in low accuracy when distinguishing between similar genera or species [3,4]. Biochemical analysis techniques, such as DNA and rRNA gene sequencing, can provide high identification accuracy, but they are hindered by complex procedures and poor timeliness, owing to their dependence on laboratory equipment [5]. Thus, there is an urgent need for a species discrimination method that balances accuracy and timeliness.
With the rapid development of computer vision and digital image processing technology, methods for automatic recognition of insect species based on image features have been proposed and have become popular over the past few decades [3]. The geometric features, wing shape, wing colour and surface texture of insects have been used as key features in species classification research [3,6]. Using traditional statistical methods, researchers proposed the concept of branch length similarity entropy to characterize wing shape and the grey-level co-occurrence matrix to extract colour and texture features for the classification of butterflies [6,7]. In recent years, deep learning has extended image-based species recognition to more scenarios with better accuracy, owing to its ability to perform automatic feature extraction [8,9]. Carvajal et al. used three convolutional neural network (CNN) models pre-trained on the ImageNet dataset to identify 19 different species of Lepidoptera with a recognition accuracy of over 92% [2]. Another study compared the classification performance of 11 deep neural networks on the Indian butterfly dataset and achieved the highest accuracy of 94.44% using ResNet-152 [10]. However, most of the datasets used in these studies consist of species with significant differences in appearance [2,4,6,11]. For species with high phenotypic similarity, the feature differences extracted by visual models are not obvious, and such differences are often complicated by age, population and seasonal variations, so these species are easily misclassified [10,12].
Hyperspectral imaging (HSI) technology can simultaneously obtain the geometric appearance and spectral features of targets, with the advantages of high efficiency, timeliness and non-destructive detection [13,14]. Combined with deep learning, HSI has been successfully applied in many fields of scientific research and industrial production, including the species identification of insects in ecological studies [15][16][17]. Compared with machine vision and image recognition methods, HSI usually has a higher recognition accuracy owing to the additional dimension of spectral sensing. Some studies have used this technique to study the wing interference patterns and structural colours of butterflies and dragonflies [18][19][20]. Takahashi evaluated the interspecific differences among 12 fruit flies using the average reflectance spectra of hyperspectral images [21]. Furthermore, some researchers developed a system that can efficiently capture multi-spectral images of Lepidoptera, which will help compare the differences between species at different classification scales [22]. This research demonstrated the effectiveness of hyperspectral images in interspecies classification and in evaluating differences among insects. However, owing to the complex algorithms and large amount of data, HSI data pretreatment and analysis models are mostly deployed on external computers and are physically separated from the hyperspectral hardware. This separation clearly hinders field applications of species discrimination. Therefore, a lightweight HSI artificial intelligence (AI) discriminating model that can be deployed on edge devices, instead of a computing centre, has significant value in ecological and biodiversity research.
In this work, we propose a rapid and accurate species discrimination method for similar insects using HSI and a lightweight AI algorithm. As an experimental demonstration, six highly similar butterfly species in the genus Graphium are discriminated by a miniaturized deep learning model, which is deployed on a neural network processing unit (NPU) chip. Feature region selection, feature spectra selection and model quantization are used to compress and optimize the discrimination algorithm. The experimental results show that the lightweight network trained on hyperspectral features achieves higher accuracy with less complexity than morphological recognition by machine vision. This study provides a species discrimination method for similar insects with both accuracy and timeliness and paves the way for miniaturized, field-use instruments with on-chip integration of spectroscopic microstructures and AI chips.
R. Soc. Open Sci. 11: 240485 (royalsocietypublishing.org/journal/rsos)

Methodology
The proposed method is based on the prior knowledge that hyperspectral images contain a large amount of redundant data in both the spatial and spectral dimensions [23]. Spatial redundancy comes from the irrelevant and non-feature regions of samples. Spectral redundancy comes from two sources: one is the high correlation that commonly exists between adjacent spectral bands, and the other is the non-feature spectral bands for specific species, which have no value to the analysis model. Thus, selections based on regions of interest (ROI) and bands of interest (BOI) for specific species are used as prior knowledge to simplify the hyperspectral data cube, while network quantization is used to compress the on-chip discriminating model.
As shown in figure 1, the proposed method of species discrimination consists of six steps.

Sample preparation
To remove the influence caused by sample preprocessing, all samples in this work are specimens prepared using the same and standard method.

Hyperspectral imaging data acquisition
The aim of HSI data acquisition is to get reflectivity data cube, and the operations are completely the same as in traditional hyperspectral applications.

Regions of interest selection
To remove the irrelevant and non-feature regions of samples from hyperspectral data cubes, a preprocessing selection of ROIs is used to reduce the amount of data. As one of the most important organs, wings play an unparalleled role in the species identification of insects. Thus, we propose to select ROIs on the wings for further feature extraction and classification. Two principles are considered during the selection of ROIs. On the one hand, these areas must be located in the middle of the wings to minimize the impact of sample defects and contamination; on the other hand, they must have clear spectral features and high reflectivity to ensure classification effectiveness.

Bands of interest selection
We first use the Savitzky-Golay (SG) method to smooth the spectra, in order to reduce the noise while preserving the shape and width of the curve [24,25]. Then, the competitive adaptive reweighted sampling (CARS) algorithm is used for the final extraction of feature bands [26,27]. Adaptive reweighted sampling is employed to reduce the number of wavelengths in a competitive manner, and the root mean squared error of cross-validation (RMSECV) is calculated as the criterion for optimal feature extraction. In this way, a key-spectra dataset is obtained from the full-spectra dataset, with the same number of spectra and fewer spectral bands.

Model optimization
We adopt the deep learning model based on CNN to extract sample features for species classification.
In the model training stage, the dataset of spectra after ROI and BOI selection is used for model building. To deploy the AI model on an NPU chip for on-the-spot inference and prediction, model quantization and conversion are performed to compress the network at the edge.

Real-time discrimination
Field identifications are executed using on-chip discriminating model.

Experimental material
Butterflies (Lepidoptera: Rhopalocera) are the second largest group of insects, with approximately 17 000 recorded species on Earth [28]. The species classification and identification of butterflies has long been a significant area of research in entomological taxonomy [3]. In this work, six butterfly species in the genus Graphium are selected for experiments: G. sarpedon, G. milon, G. doson, G. chironides, G. eurypylus and G. meyeri. The total number of samples is 140, including 10 samples of G. meyeri and 26 samples of each of the other species. All the specimens were prepared using a standard method including softening, pinning, stretching wings and drying [29,30]. As shown in figure 2, the boundary shape, spot distribution and colour of the Graphium samples are all highly similar. Therefore, fast discrimination between some of these species is especially difficult, even for professional scientists [5]. In this article, we refer to the six butterfly classes as Sarpedon, Milon, Doson, Chironides, Eurypylus and Meyeri for short.

Hyperspectral images acquisition
The HSI system used in this work, as shown in figure 3, consists of three parts: a portable hyperspectral camera mounted on a tripod, an illumination system with two 43 W halogen lamps and a computer for data acquisition. Considering the possibility of field use, we chose a hand-held hyperspectral camera, Specim IQ (Specim Spectral Imaging Ltd), with a wavelength range of 400-1000 nm and a spectral resolution of 7 nm. It is a line scan camera based on push-broom technology with 204 spectral bands. The Specim IQ Studio software is installed on the computer to export hyperspectral data and manage camera settings.
To minimize light scattering errors, the butterfly specimens and the standard white reference (WR) were placed flat on a black background with low reflectance. The camera lens was oriented vertically downward, at a distance of 26 cm from the butterfly specimen. Clear hyperspectral images of the samples were obtained by adjusting the focal length and exposure time; for this work, the exposure time is 20 ms. The resulting hyperspectral images, aside from the actual raw image data of the target area, were further processed to yield reflectance spectral data for every pixel. The reflectivity of the target pixel was acquired according to the equation

$$R = \frac{S_{\mathrm{Pixel}} - S_{\mathrm{DBG}}}{S_{\mathrm{WR}} - S_{\mathrm{DBG}}}, \tag{3.1}$$

where $S_{\mathrm{Pixel}}$ and $S_{\mathrm{WR}}$ are the raw signals of the target pixel and the WR region, respectively, and $S_{\mathrm{DBG}}$ is the dark signal of the camera for the dark background (DBG).
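The calibration described above can be sketched in a few lines. This is a minimal illustration, assuming the usual dark-corrected ratio to the white reference, R = (S_Pixel − S_DBG) / (S_WR − S_DBG); the function name `reflectance` and the raw counts are hypothetical and are not taken from the camera software.

```python
def reflectance(s_pixel, s_wr, s_dbg):
    """Per-band reflectance of one pixel from raw, white-reference and dark signals."""
    if not (len(s_pixel) == len(s_wr) == len(s_dbg)):
        raise ValueError("all spectra must have the same number of bands")
    out = []
    for p, w, d in zip(s_pixel, s_wr, s_dbg):
        denom = w - d
        # Guard against a band where the white and dark signals coincide.
        out.append((p - d) / denom if denom != 0 else float("nan"))
    return out


# Hypothetical raw counts for three bands (illustrative values only).
raw = [520.0, 1010.0, 260.0]
white = [2020.0, 4010.0, 1010.0]
dark = [20.0, 10.0, 10.0]
calibrated = reflectance(raw, white, dark)
```

In practice the same operation would be applied to every pixel of the data cube; here it is shown for a single spectrum to keep the arithmetic visible.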

Discriminating model
With its high efficiency in adaptive feature learning, the CNN has been successfully applied to classification and identification in machine vision and HSI application scenarios [2]. In this work, a discriminating model based on a CNN structure is trained, compressed and deployed on the NPU chip of the RK3588 for rapid identification.
Firstly, we established a dataset containing 1400 spectral curves using the spectra selected by ROI and BOI. All samples of the six species were divided into a training set, validation set and test set in the ratio 4 : 1 : 1. In particular, spectra of the same sample cannot appear in both the training set and the test set. To evaluate the performance and generalization ability of the model accurately, we used stratified fivefold cross-validation to process and test the dataset. This method randomly divides the entire dataset into five mutually exclusive subsets and ensures a consistent proportion of samples for each category in each subset. By sequentially selecting each subset as the test set, we ensure that all data are used as the test set for model evaluation.
For comparison with image-based recognition, image classification models were also built on the RGB images of the six species [31,32]. The image dataset was captured using the RGB lens integrated with the Specim IQ, and supplemental samples were generated by data augmentation with geometric and photometric transformations such as rotation, zooming, shifting and colour inversion. In this study, data augmentation is implemented with the imgaug library for Python. We obtained a total of 1400 RGB images for all six species, which are also divided into a training set, validation set and test set in the same ratio of 4 : 1 : 1. Similarly, pictures of the same sample cannot appear in both the training set and the test set.
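The constraint that all spectra of one specimen stay on the same side of a split can be met by assigning folds at the specimen level and letting each spectrum inherit its specimen's fold. The sketch below is one possible way to do this with the standard library; the function `stratified_sample_folds`, the specimen identifiers and the round-robin assignment are illustrative assumptions, not the authors' code.

```python
import random
from collections import defaultdict


def stratified_sample_folds(sample_labels, n_folds=5, seed=0):
    """Assign whole specimens to folds, class by class, so that every
    spectrum of a specimen ends up in the same fold."""
    by_class = defaultdict(list)
    for sample_id, label in sample_labels.items():
        by_class[label].append(sample_id)
    rng = random.Random(seed)
    fold_of = {}
    for label in sorted(by_class):
        ids = sorted(by_class[label])
        rng.shuffle(ids)
        # Round-robin assignment keeps class proportions similar across folds.
        for i, sample_id in enumerate(ids):
            fold_of[sample_id] = i % n_folds
    return fold_of


# Specimen counts from the text: 26 per species, 10 for Meyeri.
counts = {"Sarpedon": 26, "Milon": 26, "Doson": 26,
          "Chironides": 26, "Eurypylus": 26, "Meyeri": 10}
sample_labels = {f"{name}_{i:02d}": name
                 for name, n in counts.items() for i in range(n)}
fold_of = stratified_sample_folds(sample_labels)

# Ten spectra per specimen; each spectrum inherits its specimen's fold.
spectra = [(sid, k) for sid in sample_labels for k in range(10)]
test_fold = 0
test_set = [s for s in spectra if fold_of[s[0]] == test_fold]
train_set = [s for s in spectra if fold_of[s[0]] != test_fold]
```

Splitting by specimen rather than by spectrum avoids leakage, because the ten spectra of one butterfly are highly correlated.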
Finally, the classification model trained on the key-spectra dataset is deployed to the RK3588 series AI chip, which has three NPU cores with up to six tera operations per second (TOPS) of computing power. To further reduce the scale of the on-chip inference model, model conversion and quantization were performed with RKNN-Toolkit2 1.4.0 installed on Ubuntu 20.04. In binary computation, the INT8 data type offers higher throughput and lower memory requirements than FP32. Our goal for hardware acceleration was to convert models from FP32 to INT8 without significant accuracy loss. The compressed and optimized RKNN model was then loaded for inference by RKNN-Toolkit-Lite2 1.4.0 running on the RK3588 chip.
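To make the FP32-to-INT8 trade-off concrete, the following sketch applies a generic symmetric per-tensor quantization to a small list of weights and measures the round-trip error. It is an assumption-laden illustration of the general idea only: the function names are hypothetical, and the scheme is not claimed to be the one RKNN-Toolkit2 uses internally.

```python
def quantize_symmetric_int8(values):
    """Map floats to integers in [-127, 127] with a single scale factor."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the INT8 codes."""
    return [x * scale for x in q]


# Hypothetical FP32 weights (illustrative values only).
weights = [-0.8, -0.1, 0.0, 0.33, 0.5]
codes, scale = quantize_symmetric_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each value is stored in one byte instead of four, and the round-trip error is bounded by half a quantization step, which is why accuracy can survive the conversion when the value range is well covered.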

Model evaluation
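We evaluated the model in terms of both performance and complexity. The performance evaluation metrics include precision (Pr), recall (Re) and F1-score (F1), as given in the following equations [33][34][35]:

$$\mathrm{Pr} = \frac{TP}{TP + FP}, \qquad \mathrm{Re} = \frac{TP}{TP + FN}, \qquad \mathrm{F1} = \frac{2 \times \mathrm{Pr} \times \mathrm{Re}}{\mathrm{Pr} + \mathrm{Re}}$$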
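Precision, recall and F1-score can be computed per class from a confusion matrix using the standard definitions in terms of true positives (TP), false positives (FP) and false negatives (FN). The sketch below is a minimal illustration; the function name and the three-class matrix are hypothetical and do not reproduce the results of this study.

```python
def per_class_metrics(confusion):
    """confusion[i][j] = number of items of true class i predicted as class j.
    Returns a list of (precision, recall, f1) tuples, one per class."""
    n = len(confusion)
    results = []
    for c in range(n):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(n)) - tp  # predicted c, truly other
        fn = sum(confusion[c]) - tp                       # truly c, predicted other
        pr = tp / (tp + fp) if tp + fp else 0.0
        re = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * pr * re / (pr + re) if pr + re else 0.0
        results.append((pr, re, f1))
    return results


# Hypothetical three-class confusion matrix (illustrative values only).
confusion = [[8, 2, 0],
             [1, 9, 0],
             [1, 0, 9]]
metrics = per_class_metrics(confusion)
```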

Data preprocessing and feature extraction
The three-dimensional data cube of the spectral image is shown in figure 4a, with a size of 512 pixels × 512 pixels × 204 bands. The x-axis and y-axis are the two spatial dimensions, while the z-axis is the spectral dimension. The regions of the sample and WR were captured in the same data cube, and the spectrum of each pixel was calculated automatically with formula (3.1) by the Specim IQ. Considering that both spectral ends of the camera have a lower signal-to-noise ratio, only the spectral range of 420-990 nm was used in this work, comprising 191 spectral bands [36].
As shown in figure 4b, the six species have the same distribution of wing veins and compartments, which are marked in detail. We randomly selected several compartments in the centre of the wings for spectra comparison and found the same phenomenon in all species. For a specific species, the spectra at the spot position and on the black background of the wings are significantly different, while the spectra of the same pattern in different compartments are completely identical. This phenomenon is illustrated with the sample of Milon in figure 4c. Owing to the low reflectance of all the samples, it is clear that the spectra of the spot region carry more feature information and have higher signal quality.
Based on the above analysis, we chose the spot regions in compartments r5, m1, m2, m3 and cu1 of both the left and right wings as ROIs. We select 3 × 3 pixels for every ROI and take their average as one effective spectrum. Therefore, a total of 1400 spectra were obtained as our full-spectra dataset for all samples, with 10 spectra for every sample.
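Averaging a 3 × 3 pixel window into one effective spectrum can be sketched as follows. The nested-list cube layout, the function name `roi_mean_spectrum` and the toy data are assumptions made for illustration; a real pipeline would more likely operate on an array read from the camera export.

```python
def roi_mean_spectrum(cube, row, col, half=1):
    """Average the spectra in a (2*half+1) x (2*half+1) window centred on
    (row, col). cube[r][c] is the list of band values of pixel (r, c)."""
    if row - half < 0 or col - half < 0:
        raise ValueError("window extends beyond the top or left image border")
    if row + half >= len(cube) or col + half >= len(cube[0]):
        raise ValueError("window extends beyond the bottom or right image border")
    pixels = [cube[r][c]
              for r in range(row - half, row + half + 1)
              for c in range(col - half, col + half + 1)]
    n_bands = len(pixels[0])
    return [sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)]


# Toy 5 x 5 cube with 2 bands: band 0 equals row + col, band 1 is constant.
cube = [[[float(r + c), 1.0] for c in range(5)] for r in range(5)]
spectrum = roi_mean_spectrum(cube, 2, 2)
```

With five compartments on each of the two wings, ten such windows per specimen give the ten spectra per sample mentioned above.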
Before the feature selection of spectral bands, SG smoothing was applied to the full-spectra dataset, with a window of 5 points and a polynomial order of 2 [25]. We named the result the SG-full-spectra dataset. The spectral curves before and after smoothing are plotted in figure 5a,b. The SG smoothing operation reduces the noise of the spectral curves while clearly preserving the original spectral characteristics.
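For a 5-point window and polynomial order 2, the Savitzky-Golay filter reduces to a fixed convolution with the well-known coefficients (−3, 12, 17, 12, −3)/35. The sketch below applies these coefficients to interior points and, as a simplifying assumption, leaves the two points at each end unchanged; library implementations typically treat the edges differently, so this is not a drop-in replacement for whatever tool the authors used.

```python
# Savitzky-Golay smoothing coefficients for window 5, polynomial order 2.
SG_5_2 = (-3.0, 12.0, 17.0, 12.0, -3.0)


def sg_smooth_5_2(y):
    """Smooth a spectrum with the 5-point quadratic Savitzky-Golay filter.
    Edge handling is an assumption: the first and last two points are kept."""
    out = list(y)
    for i in range(2, len(y) - 2):
        out[i] = sum(c * y[i + k - 2] for k, c in enumerate(SG_5_2)) / 35.0
    return out


# A quadratic curve passes through unchanged; an isolated spike is damped.
quadratic = [float(x * x) for x in range(10)]
smoothed_quadratic = sg_smooth_5_2(quadratic)
spike = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
smoothed_spike = sg_smooth_5_2(spike)
```

The quadratic case shows why the filter preserves the shape and width of broad spectral features while attenuating point-like noise.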
Feature wavelengths were extracted by the CARS algorithm, with a sampling rate of 0.8 [26]. We set the number of sampling runs to 50. Figure 6a,b illustrates how the number of effective feature wavelengths and the RMSECV values change as the number of sampling runs increases. Because the retained fraction follows an exponentially decreasing function, the number of wavelength variables decreases rapidly at first and more slowly later. The RMSECV values first decrease and then increase. When the number of sampling runs is 16, the RMSECV value reaches its minimum, indicating that the model has reached its optimal state. Figure 6c displays the regression coefficient path for each wavelength, with a blue asterisk line marking the sampling run where the minimum RMSECV value occurs. According to the 'survival of the fittest' principle, a larger absolute value of the regression coefficient corresponds to a stronger predictive ability. We ultimately chose 47 effective feature wavelengths. As shown in figure 6d, the selected feature wavelengths are indicated with grey dashed lines on the average spectral curves of the six butterfly species. Thus, we acquired a key-spectra dataset with 47 feature bands from the full-spectra dataset.
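The exponentially decreasing function that drives the wavelength reduction can be sketched as below. This assumes the commonly published CARS formulation, in which the retained fraction at run i is r_i = a·exp(−k·i), with a and k chosen so that all p wavelengths are kept in the first run and two in the last; the function name is hypothetical, and the sketch omits the adaptive reweighted sampling and cross-validation steps of the full algorithm.

```python
import math


def cars_retention_schedule(n_wavelengths, n_runs):
    """Fraction of wavelengths retained by the exponentially decreasing
    function at runs 1..n_runs (assumed standard CARS form)."""
    half_p = n_wavelengths / 2
    a = half_p ** (1 / (n_runs - 1))
    k = math.log(half_p) / (n_runs - 1)
    return [a * math.exp(-k * i) for i in range(1, n_runs + 1)]


# 191 bands and 50 sampling runs, as stated in the text.
ratios = cars_retention_schedule(191, 50)
kept = [round(191 * r) for r in ratios]
```

Under these assumptions the schedule alone leaves about 47 wavelengths at run 16, which is in line with the number reported above, although the exact count in any implementation also depends on the reweighted sampling step.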

Model training results
As shown in figure 7a, we designed a one-dimensional CNN with 12 layers called 1D-CNN-12, including an input layer, convolution layers, pooling layers, a fully connected layer and an output layer [37]. The input of 1D-CNN-12 is a spectrum with dimensions of (47,1) for key spectra or (191,1) for full spectra. The convolution layers C1, C2, C4, C5, C7 and C8 use 4, 8, 16, 32, 64 and 64 kernels with sizes of 2, 4, 3, 2, 3 and 2, respectively. The activation function is tanh. Maximum pooling is used in pooling layers S3 (with a size of 2), S6 (with a size of 4) and S9 (with a size of 2). Both convolution layers and pooling layers are set up with padding and a stride of 1. After flattening, the output of 1D-CNN-12 is produced by a fully connected layer with a softmax activation function. We adopted dropout with a rate of 0.3 to prevent over-fitting during the training phase. The number of epochs is set to 300 with a batch size of 8.
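A rough feel for the size of such a network can be obtained by counting trainable parameters from the layer description. The sketch below rests on several assumptions that are not confirmed by the text: each convolution has a bias term, padding with a stride of 1 preserves the sequence length through every convolution and pooling layer, and a single dense layer maps the flattened features to six classes. The resulting figures are therefore illustrative and may differ from those in table 1.

```python
def conv1d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of one 1D convolution layer."""
    return in_channels * out_channels * kernel_size + out_channels


# (name, number of kernels, kernel size) as listed in the text.
CONV_LAYERS = [("C1", 4, 2), ("C2", 8, 4), ("C4", 16, 3),
               ("C5", 32, 2), ("C7", 64, 3), ("C8", 64, 2)]


def count_params(input_len, n_classes=6):
    """Total trainable parameters under the stated assumptions."""
    total, in_ch = 0, 1
    for _, out_ch, kernel in CONV_LAYERS:
        total += conv1d_params(in_ch, out_ch, kernel)
        in_ch = out_ch
    flat = input_len * in_ch  # assumes the length is preserved up to flattening
    total += flat * n_classes + n_classes
    return total


params_key = count_params(47)    # key-spectra input
params_full = count_params(191)  # full-spectra input
```

Under these assumptions most of the parameters sit in the final dense layer, so reducing the input from 191 to 47 bands shrinks the model considerably.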
According to the stratified fivefold cross-validation method, we obtained the average accuracy, standard deviation and confusion matrix over the five evaluations of each model. The discriminating results of 1D-CNN-12 based on the SG-full-spectra dataset and the key-spectra dataset are shown in figure 7b,c. [...] Chironides is 0.90 and 0.81, respectively, but still not good enough. This indicates that it is difficult to accurately distinguish similar species by relying on image features alone. By comparison, the spectral classification model presents impressive results. It performs well in the classification of Eurypylus and Meyeri, with precision of 0.86 and 0.8, respectively. The accuracy of 92.36 ± 3.04% shows the enormous potential of spectral information in species classification.

Model compression and on-chip discrimination
To demonstrate the feasibility of the proposed method for real-time species identification in field applications, the 1D-CNN-12 model was converted and deployed on the NPU chip of the RK3588, as shown in figure 8a. We compared the consistency between the network deployed on the chip and the original network on the computer. Cosine similarity was used to indicate the effect of model compression and quantization, as it can represent both the conversion error of each layer and the cumulative error of all layers. Figure 8b shows the cosine similarity results for the entire model and for every single layer, demonstrating that most layers in our conversion have high accuracy, with a cosine similarity better than 0.999. Two convolutional operators have slightly lower accuracy, at 0.992 and 0.978, but the cosine similarity of the entire model is still 1, owing to the correction and adjustment during model conversion.
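Cosine similarity between the outputs of the original and the quantized network can be computed as in the sketch below. The function name and the two output vectors are hypothetical and serve only to show the calculation; they are not measurements from the deployed model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equally long vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Hypothetical layer outputs before and after quantization.
fp32_output = [0.12, -0.50, 0.33, 0.90]
int8_output = [0.125, -0.50, 0.328, 0.898]
similarity = cosine_similarity(fp32_output, int8_output)
```

A value close to 1 means the quantized layer points in nearly the same direction as the original, which is the property that matters for the final softmax ranking.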
With the test set of SG-full-spectra as input, we completed the actual on-chip inference and the discrimination accuracy was 92.36%, which coincides with the result on the computer. The inference time for a single spectrum is 0.6 ms at an NPU operating frequency of 1000 MHz. This result also indicates the feasibility of deploying the model on low-cost processors or microcontroller chips. The above results demonstrate that the proposed method effectively balances accuracy and timeliness in the rapid species discrimination of insects.

Ablation studies
In this section, we conduct ablation studies to verify the effectiveness of the proposed method, including the feature regions selection, the SG smoothing method and the feature spectra selection.
Four datasets were used in the ablation experiment: the original spectra of the black area as shown in figure 4c, the original spectra of the spot area, the original spectra of the spot area with SG smoothing, and the key spectra of the spot area with SG smoothing and BOI selection. We used spectra from the four datasets as inputs to train networks with the same one-dimensional CNN structure. The training process and the inference results of the four networks on the test sets are compared in figure 9 and table 2, to illustrate the actual effect of the proposed method.

Feature regions selection
The deficiency of non-feature regions for species identification is evident from a comparison of figure 9a,b. The classification accuracy of the black area spectra is only 76.00 ± 5.47%, and some species are completely confused, as shown in table 2. Meanwhile, the accuracy of the spot area is up to 91.64 ± 3.83%, demonstrating that the ROIs we selected help improve discriminating ability.

Savitzky-Golay smoothing
With SG smoothing before inputting into the network, the accuracy of ROI spectra is improved to 92.36 ± 3.04%, with smaller variability as shown in figure 9c, demonstrating the positive effect of noise suppression.

Feature spectra selection
As shown in table 2, the accuracy of the model based on the 47 key spectra is 2.72% lower than that based on the full spectra, indicating that BOI selection has removed some important spectral information. In addition, the amplitude in figure 9c is smaller than that in figure 9d, and the standard deviation of the key-spectra model is larger. This is mainly owing to the complexity of the spectral features of the samples, which leaves the model unable to learn effective information when the number of features decreases, resulting in lower classification accuracy and poorer model stability [38]. We also compared the on-chip inference results of full spectra and feature spectra, and the inference time was reduced from 0.6 to 0.35 ms.
The above results confirm our expectations: compared with image-based models, spectral-based models exhibit significant advantages, with SG smoothing achieving the highest accuracy of 92.36% and the lowest standard deviation of 3.04%. Although the accuracy of the 1D-CNN-12 model decreases after feature wavelength selection, its performance is still on par with the best image classification results, and its standard deviation is lower. When we use spectral information for classification, we can not only improve accuracy but also significantly reduce the number of model parameters and the model complexity. This means lower computational cost and faster processing, and therefore a wider range of application scenarios.

Conclusion
In conclusion, we propose a fast species discrimination method for insects outdoors using HSI and a lightweight artificial intelligence algorithm. Feature region selection, feature spectra selection and model quantization are used to optimize the on-chip discrimination algorithm. As experimental verification for field applications, six butterfly species in the genus Graphium are selected for rapid discrimination based on a lightweight CNN, which is deployed on an NPU chip. The experimental results show that the spectral classification model corrects the errors of the image classification model for the similar species Eurypylus and Meyeri. However, on two similar species, Doson and Chironides, the model did not achieve good results. In terms of overall classification results, the lightweight network trained on hyperspectral features shows significant advantages, achieving accuracy comparable to machine vision shape recognition in just 0.35 ms of inference time. When we use full-spectrum information, we achieve a highly stable and accurate model with an accuracy of 92.36 ± 3.04%. Although the inference time increases slightly to 0.6 ms, it is still efficient and feasible in practical application scenarios. To further optimize the classification performance for similar species, we plan to combine spatial and spectral information from hyperspectral images in future research, in order to construct more accurate classification models. Our study provides a rapid species discrimination method for similar insects for field use and paves the way for subsequent research and applications of multi-spectral intelligent sensors based on on-chip spectroscopic microstructure integration.

Figure 1. The proposed method of rapid and accurate species discrimination.

Figure 2. The samples of six butterfly species in the genus of Graphium.

In the equations for precision, recall and F1-score, TP indicates the number of true positives, FP indicates the number of false positives and FN indicates the number of false negatives. The complexity evaluation metrics include floating point operations (FLOPs) and parameters, which were used to evaluate the computational cost and memory consumption of the model.

Figure 4. (a) Three-dimensional data cube of the spectral image; (b) wing veins and compartments of Graphium genus; (c) the spectra of spot area and black area in the wings of Milon.

Figure 8. (a) The NPU chip of RK3588 used in this work. (b) The conversion error of each layer and the cumulative error of all layers during quantization.

Table 1. The complexity evaluation results of different models.

Table 2. The training results of four models on the test set.